CloMet: A Novel Open-Source and Modular Software Platform That Connects Established Metabolomics Repositories and Data Analysis Resources

The field of metabolomics has witnessed the development of hundreds of computational tools, but only a few have become cornerstones of the field. MetaboLights and Metabolomics Workbench are two well-established repositories for metabolomics data sets, and Workflows4Metabolomics and MetaboAnalyst are two well-established web-based platforms for metabolomics data analysis. Yet the raw data stored in these repositories lack standardization in the file system format used to store the associated acquisition files. Consequently, reusing available data sets as input to these data analysis resources is not straightforward, especially for non-expert users. This paper presents CloMet, a novel open-source modular software platform that contributes to standardization, reusability, and reproducibility in the metabolomics field. CloMet, which is available through a Docker file, converts raw and NMR-based metabolomics data from MetaboLights and Metabolomics Workbench to a file format that can be used directly in either MetaboAnalyst or Workflows4Metabolomics. We validated both CloMet and the output data using data sets from these repositories. Overall, CloMet fills the gap between well-established data repositories and web-based statistical platforms and contributes to the consolidation of a data-driven perspective of the metabolomics field by leveraging and connecting existing data and resources.


Figure S1 (a). Individual spectra (n = 46) from MetaboLights study MTBLS326 after being read and

Figure S2 (a). Individual spectra (n = 10) from MetaboLights study MTBLS431 after being read and

Figure S3 (a). Individual spectra (n = 71) from MetaboLights study MTBLS869 after being read and

Figure S4. ROC curves for the discrimination of control and bacterial samples from the MetaboLights study with id MTBLS563. ROC curve analysis was generated by Monte-Carlo cross-validation using balanced subsampling and performed based on PLS-DA. Of note, control sample 1 and bacterial sample 26 were considered outliers based on an exploratory sPLS-DA model and therefore were discarded prior to the classification analysis. Also, samples were normalized by median, and features were log-transformed and mean-centered.

Figure S5. ROC curves for the discrimination of control and viral samples from the MetaboLights study with id MTBLS563. ROC curve analysis was generated by Monte-Carlo cross-validation using balanced subsampling and performed based on PLS-DA. Of note, control samples 1 and 55 were considered outliers based on an exploratory sPLS-DA model and therefore were discarded prior to the classification analysis. Also, samples were normalized by median, and features were log-transformed and mean-centered.
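
The preprocessing named in the Figure S4 and S5 captions (median normalization of samples, log transformation, and mean-centering of features) can be illustrated with a minimal sketch. This is not the code used to produce the figures. The function name `preprocess`, the choice of base-10 logarithm, and the requirement that all intensities be strictly positive are assumptions made for illustration only.

```python
import numpy as np


def preprocess(X):
    """Median-normalize samples, log-transform, then mean-center features.

    X is a 2-D array with samples in rows and features (e.g. spectral
    bins) in columns. Hypothetical helper, not part of CloMet.
    """
    X = np.asarray(X, dtype=float)

    # Sample-wise median normalization: divide each row by its own median.
    medians = np.median(X, axis=1, keepdims=True)
    X_norm = X / medians

    # Log transformation (base 10 is an assumption); values must be > 0.
    if np.any(X_norm <= 0):
        raise ValueError("log transform requires strictly positive values")
    X_log = np.log10(X_norm)

    # Mean-centering: subtract each feature's mean across samples.
    return X_log - X_log.mean(axis=0, keepdims=True)


if __name__ == "__main__":
    # Two samples that differ only by a dilution factor of 2.
    demo = np.array([[1.0, 10.0, 100.0],
                     [2.0, 20.0, 200.0]])
    print(preprocess(demo))
```

In this toy input the second sample is the first one scaled by a constant, so median normalization removes the difference and the centered values are all zero. Real spectra with non-positive intensities would need an offset or a different transform, which this sketch does not attempt.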